Efficient Interactive Sound Propagation in Dynamic Environments
The physical phenomenon of sound is ubiquitous in our everyday life and is an important component of immersion in interactive virtual reality applications. Sound propagation involves modeling how sound is emitted from a source, interacts with the environment, and is received by a listener. Previous techniques for computing interactive sound propagation in dynamic scenes are based on geometric algorithms such as ray tracing. However, the performance and quality of these algorithms are strongly dependent on the number of rays traced. In addition, it is difficult to acquire acoustic material properties, and it is challenging to efficiently compute spatial sound effects from the output of ray-tracing-based sound propagation. These problems lead to increased latency and less plausible sound in dynamic interactive environments. In this dissertation, we propose three approaches with the goal of addressing these challenges. First, we present an approach that utilizes temporal coherence in the sound field to reuse computation from previous simulation time steps. Second, we present a framework for the automatic acquisition of acoustic material properties using visual and audio measurements of real-world environments. Finally, we propose efficient techniques for computing directional spatial sound for sound propagation with low latency using head-related transfer functions (HRTFs). We have evaluated both the performance and subjective impact of these techniques on a variety of complex dynamic indoor and outdoor environments and observe an order-of-magnitude speedup over previous approaches. The accuracy of our approaches has been validated against real-world measurements and previous methods. The proposed techniques enable interactive simulation of sound propagation in complex multi-source dynamic environments.
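The HRTF-based spatialization the abstract mentions reduces, in the time domain, to convolving a mono source signal with a left/right pair of head-related impulse responses (HRIRs). The sketch below is a minimal illustration of that core operation; the toy HRIR values are hypothetical and stand in for measured responses, not for the dissertation's actual low-latency method.

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Render a mono signal as two-channel binaural audio by convolving
    it with a head-related impulse response (HRIR) pair, the
    time-domain counterpart of an HRTF."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right])

# Hypothetical toy HRIRs: the right-ear response is delayed and
# attenuated, crudely approximating a source to the listener's left.
hrir_l = np.array([1.0, 0.0, 0.0])
hrir_r = np.array([0.0, 0.0, 0.5])
mono = np.array([1.0, -1.0])

binaural = spatialize(mono, hrir_l, hrir_r)  # shape (2, 4)
```

In practice an interactive system would select or interpolate HRIRs per source direction every frame, which is where the latency challenge described above arises.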
SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
We introduce SoundSpaces 2.0, a platform for on-the-fly geometry-based audio
rendering for 3D environments. Given a 3D mesh of a real-world environment,
SoundSpaces can generate highly realistic acoustics for arbitrary sounds
captured from arbitrary microphone locations. Together with existing 3D visual
assets, it supports an array of audio-visual research tasks, such as
audio-visual navigation, mapping, source localization and separation, and
acoustic matching. Compared to existing resources, SoundSpaces 2.0 has the
advantages of allowing continuous spatial sampling, generalization to novel
environments, and configurable microphone and material properties. To our
knowledge, this is the first geometry-based acoustic simulation that offers
high fidelity and realism while also being fast enough to use for embodied
learning. We showcase the simulator's properties and benchmark its performance
against real-world audio measurements. In addition, we demonstrate two
downstream tasks -- embodied navigation and far-field automatic speech
recognition -- and highlight sim2real performance for the latter. SoundSpaces
2.0 is publicly available to facilitate wider research for perceptual systems
that can both see and hear.
Comment: Camera-ready version. Website: https://soundspaces.org. Project page: https://vision.cs.utexas.edu/projects/soundspaces
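Geometry-based audio rendering of the kind SoundSpaces 2.0 provides ultimately produces a room impulse response (RIR) for a given source/microphone pair; the audio "heard" at the microphone is the source signal convolved with that RIR. The sketch below shows only this final rendering step with a hypothetical toy RIR; it does not use the SoundSpaces API.

```python
import numpy as np

def render_received(source, rir):
    """Compute the signal at a microphone by convolving the dry source
    signal with a (simulated or measured) room impulse response."""
    return np.convolve(source, rir)

# Hypothetical toy RIR: a direct path followed by one attenuated,
# delayed reflection two samples later.
rir = np.array([1.0, 0.0, 0.3])
src = np.array([1.0, 0.5])

received = render_received(src, rir)
```

A simulator that regenerates the RIR whenever the microphone or scene geometry changes is what enables the continuous spatial sampling and embodied-learning use cases described above.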